Videorealistic talking faces: a morphing approach

Authors

Tony Ezzat

Tomaso A. Poggio

Abstract

We present a method for the construction of a videorealistic text-to-audiovisual speech synthesizer. A visual corpus of a subject enunciating a set of key words is initially recorded. The key words are chosen so that they collectively contain most of the American English viseme images, which are subsequently identified and extracted from the data by hand. Next, using optical flow methods borrowed from the computer vision literature, we compute realistic transitions from every viseme to every other viseme. The images along these transition paths are generated using a morphing method. Finally, we exploit phoneme and timing information extracted from a text-to-speech synthesizer to determine which viseme transitions to use, and the rate at which the morphing process should occur. In this manner, we are able to synchronize the visual speech stream with the audio speech stream, and hence give the impression of a videorealistic talking face.
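The final step of the abstract, using phoneme timing to pick viseme transitions and set the morph rate, can be illustrated with a minimal sketch. This is not the authors' implementation. The phoneme-to-viseme table, the `(phoneme, duration_ms)` input format, the function name `frame_schedule`, and the choice to place each viseme keyframe at the start of its phoneme are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical, heavily abbreviated phoneme-to-viseme table (not from the paper).
PHONEME_TO_VISEME = {
    "sil": "rest",
    "p": "p_b_m", "b": "p_b_m", "m": "p_b_m",
    "f": "f_v", "v": "f_v",
    "aa": "aa", "iy": "iy", "uw": "uw",
}


def frame_schedule(
    phonemes: List[Tuple[str, int]], fps: int = 25
) -> List[Tuple[str, str, float]]:
    """Return one (source_viseme, target_viseme, alpha) triple per video frame.

    phonemes: (phoneme, duration in milliseconds) pairs, an assumed format
              for the timing output of a text-to-speech engine.
    alpha:    morph parameter in [0, 1); 0 shows the source viseme and values
              near 1 approach the target viseme.
    Assumption: each viseme keyframe sits at the start of its phoneme, so the
    transition to the next viseme spans the current phoneme's duration.
    """
    visemes = [PHONEME_TO_VISEME[p] for p, _ in phonemes]

    # Start time (ms) of every phoneme, plus the total utterance length.
    starts, t = [], 0
    for _, duration in phonemes:
        starts.append(t)
        t += duration
    n_frames = t * fps // 1000

    frames = []
    seg = 0  # index of the phoneme that contains the current frame time
    for k in range(n_frames):
        time_ms = k * 1000 / fps
        while seg + 1 < len(starts) and time_ms >= starts[seg + 1]:
            seg += 1
        if seg + 1 < len(visemes):
            # Shorter phonemes give a faster morph: alpha rises more per frame.
            alpha = (time_ms - starts[seg]) / phonemes[seg][1]
            frames.append((visemes[seg], visemes[seg + 1], alpha))
        else:
            # Last phoneme: nothing to morph toward, so hold its viseme.
            frames.append((visemes[seg], visemes[seg], 0.0))
    return frames


if __name__ == "__main__":
    for frame in frame_schedule([("m", 80), ("aa", 120), ("sil", 40)], fps=25):
        print(frame)
```

In a full system, each `(source, target, alpha)` triple would select a precomputed optical-flow transition and the point along it at which to render the morphed image. The sketch covers only the scheduling, not the warping or blending.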


Similar resources

A Personalized Talking/moving Head Presentation Using Image-based Transformations

This paper addresses the problem of multimedia presentation of a moving and talking head. Existing approaches either use a 3D geometrical representation or multiple 2D views to model and reconstruct faces. To achieve realistic appearance and personalization, and avoid complex models, we propose two approaches. First, we show how pure image-based view morphing can be used to create new moving an...


Near-videorealistic synthetic visual speech using non-rigid appearance models

In this paper we present work towards videorealistic synthetic visual speech using non-rigid appearance models. These models are used to track a talking face enunciating a set of training sentences. The resultant parameter trajectories are used in a concatenative synthesis scheme, where samples of original data are extracted from a corpus and concatenated to form new unseen sequences. Here we e...


A Cantonese Speech-Driven Talking Face Using Translingual Audio-to-Visual Conversion

This paper proposes a novel approach towards a videorealistic, speech-driven talking face for Cantonese. We present a technique that realizes a talking face for a target language (Cantonese) using only audio-visual facial recordings for a base language (English). Given a Cantonese speech input, we first use a Cantonese speech recognizer to generate a Cantonese syllable transcription. Then we ma...


Prototyping and Transforming Visemes for Animated Speech

Animated talking faces can be generated from a set of predefined face and mouth shapes (visemes) by either concatenation or morphing. Each facial image corresponds to one or more phonemes, which are generated in synchrony with the visual changes. Existing implementations require a full set of facial visemes to be captured or created by an artist before the images can be animated. In this work w...


Text to visual synthesis with appearance models

This paper presents a new method named text to visual synthesis with appearance models (TEVISAM) for generating videorealistic talking heads. In a first step, the system learns a person-specific facial appearance model (PSFAM) automatically. PSFAM allows modeling all facial components (e.g. eyes, mouth, etc) independently and it will be used to animate the face from the input text dynamically. ...




Publication date: 1997
